multi-model endpoint
Load Testing SageMaker Multi-Model Endpoints
Productionizing machine learning models is a complicated practice. It takes a lot of iteration over model parameters, hardware configurations, and traffic patterns, all of which you have to test before finalizing a production-grade deployment. Load testing is an essential software engineering practice, and it is just as crucial in the MLOps space for seeing how your model performs in a real-world setting. How can we load test? A simple yet highly effective framework is the Python package Locust. Locust can be used in both a vanilla and a distributed mode to simulate up to thousands of transactions per second (TPS).
How to scale machine learning inference for multi-tenant SaaS use cases
This post is co-written with Sowmya Manusani, Sr. Staff Machine Learning Engineer at Zendesk. Zendesk is a SaaS company that builds support, sales, and customer engagement software for everyone, with simplicity as the foundation. It thrives on helping over 170,000 companies worldwide serve their hundreds of millions of customers efficiently. The Machine Learning team at Zendesk is responsible for enhancing Customer Experience teams to achieve their best. By combining the power of data and people, Zendesk delivers intelligent products that make their customers more productive by automating manual work. Zendesk has been building ML products since 2015, including Answer Bot, Satisfaction Prediction, Content Cues, Suggested Macros, and many more.
Host multiple TensorFlow computer vision models using Amazon SageMaker multi-model endpoints
Amazon SageMaker helps data scientists and developers prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML. SageMaker accelerates innovation within your organization by providing purpose-built tools for every step of ML development, including labeling, data preparation, feature engineering, statistical bias detection, AutoML, training, tuning, hosting, explainability, monitoring, and workflow automation. Companies are increasingly training ML models based on individual user data. For example, an image sharing service designed to enable discovery of information on the internet trains custom models based on each user's uploaded images and browsing history to personalize recommendations for that user. The company can also train custom models based on search topics for recommending images per topic.